AITopics

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence (1.00)

Borzechowski, Florian, Schäfer, Michael, Schwarz, Heiko, Pfaff, Jonathan, Marpe, Detlev, Wiegand, Thomas

Optimizing Learned Image Compression on Scalar and Entropy-Constraint Quantization

arXiv.org Artificial Intelligence, Jun-11-2025

The continuous improvements on image compression with variational autoencoders have led to learned codecs competitive with conventional approaches in terms of rate-distortion efficiency. Nonetheless, taking the quantization into account during the training process remains a problem, since it produces zero derivatives almost everywhere and needs to be replaced with a differentiable approximation which allows end-to-end optimization. Though there are different methods for approximating the quantization, none of them models the quantization noise correctly, and thus they result in suboptimal networks. Hence, we propose an additional finetuning training step: after conventional end-to-end training, parts of the network are retrained on quantized latents obtained at the inference stage. For entropy-constraint quantizers like Trellis-Coded Quantization, the impact of the quantizer is particularly difficult to approximate by rounding or adding noise, as the quantized latents are interdependently chosen through a trellis search based on both the entropy model and a distortion measure. We show that retraining on correctly quantized data consistently yields additional coding gain for both uniform scalar and especially for entropy-constraint quantization, without increasing inference complexity. For the Kodak test set, we obtain average savings between 1% and 2%, and for the TecNick test set up to 2.2% in terms of Bjøntegaard-Delta bitrate.
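The mismatch this abstract describes can be illustrated with a small sketch. It is not the authors' code: the function names, the latent values, and the finite-difference check are assumptions chosen for illustration. It contrasts hard rounding, as used at inference, with the additive uniform noise commonly used as a training-time stand-in, and shows that the derivative of rounding is zero away from the decision boundaries.

```python
import random

def quantize(latents):
    """Hard uniform scalar quantization as applied at inference.

    Note: Python's round() resolves exact .5 ties to the nearest even integer.
    """
    return [float(round(v)) for v in latents]

def noisy_proxy(latents, rng):
    """Training-time stand-in: add uniform noise in [-0.5, 0.5] instead of rounding."""
    return [v + rng.uniform(-0.5, 0.5) for v in latents]

def numerical_grad(f, x, eps=1e-3):
    """Central finite difference, used here only to probe the slope of f at x."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

rng = random.Random(0)                # fixed seed so the sketch is repeatable
latents = [0.2, 1.7, -0.4, 2.49]      # hypothetical latent values

hard = quantize(latents)              # what the decoder actually receives
soft = noisy_proxy(latents, rng)      # what the network sees during training

# Rounding is piecewise constant, so its slope is zero almost everywhere.
grad_round = numerical_grad(lambda v: float(round(v)), 0.2)

print("hard:", hard)
print("soft:", soft)
print("d round / d y at 0.2:", grad_round)
```

The noisy values stay within half a step of the inputs but do not coincide with the rounded ones, which is the train/inference gap that a finetuning pass on truly quantized latents is meant to close.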

artificial intelligence, machine learning, quantization, (17 more...)

doi: 10.1109/ICIP51287.2024.10648254

2506.08662

Country: Europe > Germany (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, May-27-2025, 20:57:57 GMT

Causal Context Adjustment Loss for Learned Image Compression

In recent years, learned image compression (LIC) technologies have notably surpassed conventional methods in terms of rate-distortion (RD) performance. Most present learned techniques are VAE-based with an autoregressive entropy model, which promotes the RD performance by utilizing the decoded causal context. However, extant methods are highly dependent on a fixed, hand-crafted causal context. The question of how to guide the auto-encoder to generate a more effective causal context that benefits the autoregressive entropy model is worth exploring. In this paper, we make a first attempt at investigating how to explicitly adjust the causal context with our proposed Causal Context Adjustment loss (CCA-loss).

autoregressive entropy model, causal context adjustment loss, learned image compression, (3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.63)

arXiv.org Artificial Intelligence, Mar-31-2025

Learned Image Compression and Restoration for Digital Pathology

Lee, SeonYeong, Seong, EonSeung, Lee, DongEon, Lee, SiYeoul, Cho, Yubin, Park, Chunsu, Kim, Seonho, Seo, MinKyung, Ko, YoungSin, Kim, MinWoo

Preprint, compiled April 2, 2025.

SeonYeong Lee (1), EonSeung Seong (1), DongEon Lee (1), SiYeoul Lee (1), Yubin Cho (1), Chunsu Park (1), Seonho Kim (1), MinKyung Seo (1), YoungSin Ko (3), and MinWoo Kim (1,2,*)

(1) Department of Information Convergence Engineering, Pusan National University, Yangsan, Korea
(2) School of Biomedical Convergence Engineering, Pusan National University, Yangsan, Korea
(3) Seegene Medical Foundation, Seoul, Korea

The first two authors contributed equally to this work.

Abstract: Digital pathology images play a crucial role in medical diagnostics, but their ultra-high resolution and large file sizes pose significant challenges for storage, transmission, and real-time visualization. To address these issues, we propose CLERIC, a novel deep learning-based image compression framework designed specifically for whole slide images (WSIs). CLERIC integrates a learnable lifting scheme and advanced convolutional techniques to enhance compression efficiency while preserving critical pathological details. Our framework employs a lifting-scheme transform in the analysis stage to decompose images into low- and high-frequency components, enabling more structured latent representations. These components are processed through parallel encoders incorporating Deformable Residual Blocks (DRB) and Recurrent Residual Blocks (R2B) to improve feature extraction and spatial adaptability. The synthesis stage applies an inverse lifting transform for effective image reconstruction, ensuring high-fidelity restoration of fine-grained tissue structures. We evaluate CLERIC on a digital pathology image dataset and compare its performance against state-of-the-art learned image compression (LIC) models. Experimental results demonstrate that CLERIC achieves superior rate-distortion (RD) performance, significantly reducing storage requirements while maintaining high diagnostic image quality. Our study highlights the potential of deep learning-based compression in digital pathology, facilitating efficient data management and long-term storage while ensuring seamless integration into clinical workflows and AI-assisted diagnostic systems.

Keywords: Learned Image Compression, Deep Learning, Wavelet Transform, Digital Pathology, Whole Slide Image.

1 Introduction: Digital pathology images serve as fundamental data for various medical applications, playing a crucial role in cancer diagnosis, disease analysis, and treatment planning. These images are typically stored as Whole Slide Images (WSIs), which are characterized by ultra-high resolution (typically 0.25 µm/px). A single uncompressed WSI can often exceed several gigabytes in size (e.g., 20-30 GB per image), posing significant challenges in terms of storage, transmission, and computational efficiency.
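The lifting-scheme decomposition mentioned in the abstract can be illustrated with a fixed, non-learned example. The sketch below is an assumption for illustration only: it uses an integer Haar-style lifting step on a 1-D signal, whereas CLERIC is described as using a learnable lifting scheme on images. It shows the split, predict, and update steps producing low- and high-frequency components, and the inverse transform recovering the input exactly.

```python
def lifting_forward(x):
    """One integer Haar-style lifting step on an even-length list of ints."""
    even, odd = x[0::2], x[1::2]                       # split into two phases
    high = [o - e for e, o in zip(even, odd)]          # predict: odd from even
    low = [e + (h // 2) for e, h in zip(even, high)]   # update: approximate pair mean
    return low, high

def lifting_inverse(low, high):
    """Undo the lifting step by reversing update, then predict, then merging."""
    even = [s - (h // 2) for s, h in zip(low, high)]
    odd = [h + e for e, h in zip(even, high)]
    x = [0] * (2 * len(low))
    x[0::2], x[1::2] = even, odd
    return x

signal = [10, 12, 11, 9, 50, 54, 7, 7]    # hypothetical row of pixel values
low, high = lifting_forward(signal)
restored = lifting_inverse(low, high)

print("low: ", low)       # smooth, low-frequency component
print("high:", high)      # detail, high-frequency component
print("restored == signal:", restored == signal)
```

Because each lifting step is inverted by simply replaying it with the opposite sign, the transform stays invertible whatever predict and update functions are used, which is what makes a learned version of these steps possible.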

artificial intelligence, deep learning, machine learning, (18 more...)

2503.23862

Country:

Asia > South Korea > Seoul > Seoul (0.24)
Europe > Hungary (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Therapeutic Area > Oncology (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mazouz, Alaa, Chaudhuri, Sumanta, Cagnazzo, Marco, Mitrea, Mihai, Tartaglione, Enzo, Fiandrotti, Attilio

Lightweight Embedded FPGA Deployment of Learned Image Compression with Knowledge Distillation and Hybrid Quantization

arXiv.org Artificial Intelligence, Mar-13-2025

Learnable Image Compression (LIC) has shown the potential to outperform standardized video codecs in RD efficiency, prompting research into hardware-friendly implementations. Most existing LIC hardware implementations prioritize latency over RD efficiency, and do so through an extensive exploration of the hardware design space. We present a novel design paradigm where the burden of tuning the design for a specific hardware platform is shifted toward model dimensioning, without compromising on RD efficiency. First, we design a framework for distilling a leaner student LIC model from a reference teacher: by tuning a single model hyperparameter, we can meet the constraints of different hardware platforms without a complex hardware design exploration. Second, we propose a hardware-friendly implementation of the Generalized Divisive Normalization (GDN) activation that preserves RD efficiency even after parameter quantization. Third, we design a pipelined FPGA configuration which takes full advantage of available FPGA resources by leveraging parallel processing and optimizing resource allocation. Our experiments with a state-of-the-art LIC model show that we outperform all existing FPGA implementations while performing very close to the original model.

compression, implementation, operation, (14 more...)

2503.04832

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Hardware (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Neural Information Processing Systems, Oct-8-2024, 16:10:48 GMT

Joint Autoregressive and Hierarchical Priors for Learned Image Compression

compression performance, joint autoregressive, learned image compression, (2 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence (1.00)

Neural Information Processing Systems, Oct-7-2024, 10:27:19 GMT

Reviews: Joint Autoregressive and Hierarchical Priors for Learned Image Compression

Summary: This paper extends the autoencoder trained for compression of Balle et al. (2018) with a small autoregressive model. The autoencoder of Balle uses Gaussian scale mixtures (GSMs) for entropy encoding of coefficients, and encodes its latent variables as side information in the bit stream. Here, conditional Gaussian mixtures are used which additionally use neighboring coefficients as context. The authors find that this significantly improves compression performance.

Good
– Good performance (notably, state-of-the-art MS-SSIM results without optimizing directly on this metric)
– Extensive supplementary materials, including rate-distortion curves for individual images
– Well written

Bad
– Incremental, with no real conceptual contributions
– Missing related work: There is a long history of conditional Gaussian mixture models for autoregressive modeling of images – including for entropy rate estimation – that is arguably more relevant than other generative models mentioned in the paper: Domke et al. (2008), Hosseini et al. (2010), Theis et al. (2012), Uria et al. (2013), Theis et al. (2015)

joint autoregressive, learned image compression, review, (2 more...)

Technology:

Information Technology > Artificial Intelligence (0.92)
Information Technology > Sensing and Signal Processing > Image Processing (0.85)

Spadaro, Gabriele, Presta, Alberto, Tartaglione, Enzo, Giraldo, Jhony H., Grangetto, Marco, Fiandrotti, Attilio

GABIC: Graph-based Attention Block for Image Compression

arXiv.org Artificial Intelligence, Oct-3-2024

While standardized codecs like JPEG and HEVC-intra represent the industry standard in image compression, neural Learned Image Compression (LIC) codecs represent a promising alternative. In detail, integrating attention mechanisms from Vision Transformers into LIC models has shown improved compression efficiency. However, extra efficiency often comes at the cost of aggregating redundant features. This work proposes a Graph-based Attention Block for Image Compression (GABIC), a method to reduce feature redundancy based on a k-Nearest Neighbors enhanced attention mechanism. Our experiments show that GABIC outperforms comparable methods, particularly at high bit rates, enhancing compression performance.

artificial intelligence, compression, machine learning, (13 more...)

2410.02981

Country:

Europe > Italy > Piedmont > Turin Province > Turin (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.54)

Liebender, Christoph, Bezerra, Ranulfo, Ohno, Kazunori, Tadokoro, Satoshi

Region of Interest Loss for Anonymizing Learned Image Compression

arXiv.org Artificial Intelligence, Jun-9-2024

The use of AI in public spaces continually raises concerns about privacy and the protection of sensitive data. An example is the deployment of detection and recognition methods on humans, where images are provided by surveillance cameras. This results in the acquisition of great amounts of sensitive data, since the images taken by such cameras are captured and transmitted unaltered to a server on the network. However, many applications do not explicitly require the identity of a given person in a scene; an anonymized representation containing information on the person's position while preserving their context in the scene suffices. We show how using a customized loss function on regions of interest (ROI) can achieve sufficient anonymization such that human faces become unrecognizable while persons are kept detectable, by training an end-to-end optimized autoencoder for learned image compression that utilizes the flexibility of the learned analysis and reconstruction transforms for the task of mutating parts of the compression result. This approach enables compression and anonymization in one step on the capture device, instead of transmitting sensitive, non-anonymized data over the network. Additionally, we evaluate how this anonymization impacts the average precision of pre-trained foundation models on detecting faces (MTCNN) and humans (YOLOv8) in comparison to non-ANN based methods, while considering compression rate and latency.

compression, image compression, precision, (12 more...)

2406.05726

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
Asia > Japan > Honshū > Tōhoku (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (0.88)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Apr-15-2024

Compressible and Searchable: AI-native Multi-Modal Retrieval System with Learned Image Compression

Luo, Jixiang

The burgeoning volume of digital content across diverse modalities necessitates efficient storage and retrieval methods. Conventional approaches struggle to cope with the escalating complexity and scale of multimedia data. In this paper, our proposed framework addresses this challenge by fusing AI-native multi-modal search capabilities with neural image compression. First, we analyze the intricate relationship between compressibility and searchability, recognizing the pivotal role each plays in the efficiency of storage and retrieval systems. A simple adapter then bridges the features of Learned Image Compression (LIC) and Contrastive Language-Image Pretraining (CLIP) while retaining semantic fidelity and retrieval of multi-modal data. Experimental evaluations on the Kodak dataset demonstrate the efficacy of our approach, showcasing significant enhancements in compression efficiency and search accuracy compared to existing methodologies. Our work marks a significant advancement towards scalable and efficient multi-modal search systems in the era of big data.

compression, image compression, retrieval, (14 more...)

2404.10234

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.40)

Industry: Semiconductors & Electronics (0.35)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)